An Improved Composite Hypothesis Test for Markov Models with Applications in Network Anomaly Detection
Recent work has proposed the use of a composite hypothesis Hoeffding test for
statistical anomaly detection. Setting an appropriate threshold for the test
given a desired false alarm probability involves approximating the false alarm
probability. To that end, a large deviations asymptotic is typically used
which, however, often results in an inaccurate setting of the threshold,
especially for relatively small sample sizes. This, in turn, results in an
anomaly detection test that does not control well for false alarms. In this
paper, we develop a tighter approximation using the Central Limit Theorem (CLT)
under Markovian assumptions. We apply our result to a network anomaly detection
application and demonstrate its advantages over earlier work. Comment: 6 pages, 6 figures; final version for CDC 201
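As a rough illustration of the kind of statistic a Hoeffding test thresholds in the Markov setting, the sketch below computes the conditional relative entropy between the empirical transition behavior of an observed path and a reference transition matrix. The function name `markov_hoeffding_statistic`, the dictionary layout of `transition`, and the toy data are assumptions made for illustration; the abstract does not give the paper's exact empirical measure or its CLT-based threshold, so the threshold is left to the caller.

```python
import math
from collections import Counter


def markov_hoeffding_statistic(path, transition):
    """Conditional relative entropy of observed transitions vs. a reference chain.

    path       -- observed state sequence (hypothetical input format)
    transition -- dict of dicts, transition[i][j] = reference P(j | i)
    """
    pairs = Counter(zip(path, path[1:]))   # counts of observed transitions (i, j)
    starts = Counter(path[:-1])            # how often each state is left
    n = len(path) - 1                      # number of observed transitions
    stat = 0.0
    for (i, j), count in pairs.items():
        p_ref = transition[i].get(j, 0.0)
        if p_ref == 0.0:
            # The reference model forbids this transition entirely.
            return math.inf
        empirical_conditional = count / starts[i]
        stat += (count / n) * math.log(empirical_conditional / p_ref)
    return stat


# Toy reference chain: every transition has probability 0.5.
reference = {"a": {"a": 0.5, "b": 0.5}, "b": {"a": 0.5, "b": 0.5}}
typical = markov_hoeffding_statistic("aabba", reference)
unusual = markov_hoeffding_statistic("aaaaa", reference)
```

An alarm would be raised when the statistic exceeds a chosen threshold; how that threshold is set for a target false alarm probability is exactly the question the paper addresses, and it is not reproduced here.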
Botnet Detection using Social Graph Analysis
Signature-based botnet detection methods identify botnets by recognizing
Command and Control (C&C) traffic and can be ineffective for botnets that use
new and sophisticated mechanisms for such communications. To address these
limitations, we propose a novel botnet detection method that analyzes the
social relationships among nodes. The method consists of two stages: (i)
anomaly detection in an "interaction" graph among nodes using large deviations
results on the degree distribution, and (ii) community detection in a social
"correlation" graph whose edges connect nodes with highly correlated
communications. The latter stage uses a refined modularity measure and
formulates the problem as a non-convex optimization problem for which
appropriate relaxation strategies are developed. We apply our method to
real-world botnet traffic and compare its performance with other community
detection methods. The results show that our approach works effectively and the
refined modularity measure improves the detection accuracy. Comment: 7 pages. Allerton Conferenc
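For orientation, the sketch below computes the standard Newman modularity of a partition of an undirected graph, the baseline quantity that community detection methods of this kind score. It is not the refined modularity measure of the paper, whose definition the abstract does not give. The function name `modularity`, the edge-list input, and the toy graph are assumptions for illustration.

```python
from collections import defaultdict


def modularity(edges, community):
    """Standard Newman modularity of a partition of an undirected, unweighted graph.

    edges     -- list of (u, v) pairs, each undirected edge listed once
    community -- dict mapping each node to its community label
    """
    m = len(edges)
    internal = defaultdict(int)     # edges with both endpoints in the community
    degree_sum = defaultdict(int)   # total degree of the community's nodes
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    # Q = sum over communities of (internal edge fraction - expected fraction).
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Toy graph: two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
one_group = {node: "A" for node in range(6)}
q_split = modularity(edges, two_groups)
q_merged = modularity(edges, one_group)
```

Splitting the toy graph into its two triangles scores higher than lumping all nodes together, which is the behavior a modularity-maximizing method exploits.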
Outlier detection using distributionally robust optimization under the Wasserstein metric
We present a Distributionally Robust Optimization (DRO) approach to outlier detection in a linear regression setting, where the closeness of probability distributions is measured using the Wasserstein metric. Training samples contaminated with outliers skew the regression plane computed by least squares and thus impede outlier detection. Classical approaches, such as robust regression, remedy this problem by downweighting the contribution of atypical data points. In contrast, our Wasserstein DRO approach hedges against a family of distributions that are close to the empirical distribution. We show that the resulting formulation encompasses a class of models, which includes the regularized Least Absolute Deviation (LAD) as a special case. We provide new insights into the regularization term and give guidance on the selection of the regularization coefficient from the standpoint of a confidence region. We establish two types of performance guarantees for the solution to our formulation under mild conditions. One is related to its out-of-sample behavior, and the other concerns the discrepancy between the estimated and true regression planes. Extensive numerical results demonstrate the superiority of our approach over both robust regression and the regularized LAD in terms of estimation accuracy and outlier detection rates.
Robust measurement-based buffer overflow probability estimators for QoS provisioning and traffic anomaly prediction applications
Suitable estimators for a class of Large Deviation approximations of rare
event probabilities based on sample realizations of random processes have been
proposed in our earlier work. These estimators are expressed as non-linear
multi-dimensional optimization problems of a special structure. In this paper,
we develop an algorithm to solve these optimization problems very efficiently
based on their characteristic structure. After discussing the nature of the
objective function and constraint set and their peculiarities, we provide a
formal proof that the developed algorithm is guaranteed to always converge. The
existence of efficient and provably convergent algorithms for solving these
problems is a prerequisite for using the proposed estimators in real time
problems such as call admission control, adaptive modulation and coding with
QoS constraints, and traffic anomaly detection in high data rate communication
networks.
Robust Anomaly Detection in Dynamic Networks
We propose two robust methods for anomaly detection in dynamic networks in
which the properties of normal traffic are time-varying. We formulate the
robust anomaly detection problem as a binary composite hypothesis testing
problem and propose two methods: a model-free and a model-based one, leveraging
techniques from the theory of large deviations. Both methods require a family
of Probability Laws (PLs) that represent normal properties of traffic. We
devise a two-step procedure to estimate this family of PLs. We compare the
performance of our robust methods and their vanilla counterparts, which assume
that normal traffic is stationary, on a network with a diurnal normal pattern
and a common anomaly related to data exfiltration. Simulation results show that
our robust methods perform better than their vanilla counterparts in dynamic
networks. Comment: 6 pages. MED conferenc
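One plausible reading of the model-free method is a Hoeffding-style test generalized to a family of PLs: the observed sample is flagged only if its empirical distribution is far, in relative entropy, from every PL in the family. The sketch below follows that reading under an i.i.d. simplification. The function names, the `family` layout, and the traffic labels are assumptions, and the abstract does not confirm that this is the paper's exact statistic.

```python
import math
from collections import Counter


def kl_to_reference(samples, reference):
    """Relative entropy D(empirical || reference) for a list of discrete samples."""
    n = len(samples)
    total = 0.0
    for symbol, count in Counter(samples).items():
        p = count / n
        q = reference.get(symbol, 0.0)
        if q == 0.0:
            # The reference law assigns no mass to an observed symbol.
            return math.inf
        total += p * math.log(p / q)
    return total


def robust_hoeffding_test(samples, family, threshold):
    """Flag an anomaly only if the sample is far from every PL in the family."""
    closest = min(kl_to_reference(samples, pl) for pl in family)
    return closest > threshold


# Hypothetical family: daytime and nighttime traffic-level distributions.
family = [
    {"low": 0.2, "high": 0.8},   # assumed daytime PL
    {"low": 0.8, "high": 0.2},   # assumed nighttime PL
]
night_like = ["low"] * 8 + ["high"] * 2
odd_mix = ["low"] * 5 + ["high"] * 5
flag_night = robust_hoeffding_test(night_like, family, threshold=0.1)
flag_odd = robust_hoeffding_test(odd_mix, family, threshold=0.1)
```

A test against a single stationary PL, the vanilla counterpart in the abstract's terms, would flag the night-like sample whenever the daytime PL is the only reference; taking the minimum over the family avoids that false alarm.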